Skip to content

[WIP] Address feedback on ROCM-20519: replace racy pre-sync host read with hipStreamQuery#4256

Draft
Copilot wants to merge 2 commits intoROCM-20519from
copilot/sub-pr-4250
Draft

[WIP] Address feedback on ROCM-20519: replace racy pre-sync host read with hipStreamQuery#4256
Copilot wants to merge 2 commits intoROCM-20519from
copilot/sub-pr-4250

Conversation

Copy link
Contributor

Copilot AI commented Mar 20, 2026

Unit_hipDeviceSynchronize_Functional used a host-memory read to assert the device was still busy before calling hipDeviceSynchronize(). That read raced with the in-flight D2H hipMemcpyAsync writing to the same pinned buffer — correctness depended entirely on the copy being slow enough to not finish first.

Motivation

Eliminate the data race in the pre-sync assertion of Unit_hipDeviceSynchronize_Functional. Reading shared host memory while a D2H async copy is in flight is undefined behavior; it was only masked by inflating the copy size via HIP_TEST_DEVICE_SYNCHRONIZE_FUNCTIONAL_COPY_MB, making the test fragile across fast GPUs and non-default HIP_LAUNCH_BLOCKING settings.

Technical Details

  • projects/hip-tests/catch/unit/device/hipDeviceSynchronize.cc
    • Removed the racy REQUIRE(NUM_ITERS != A[NUM_STREAMS - 1][0] - 1) pre-sync assertion and its associated workaround comment.
    • Replaced with a hipStreamQuery loop that checks for hipErrorNotReady — confirms at least one stream still has pending work without touching any shared host buffer:
// Before (racy): reads host buffer concurrent with D2H copy
REQUIRE(NUM_ITERS != A[NUM_STREAMS - 1][0] - 1);

// After: race-free stream status check
bool anyStreamBusy = false;
for (int i = 0; i < NUM_STREAMS; i++) {
  if (hipStreamQuery(stream[i]) == hipErrorNotReady) {
    anyStreamBusy = true;
    break;
  }
}
REQUIRE(anyStreamBusy);
  • The post-sync correctness check (REQUIRE(NUM_ITERS == A[NUM_STREAMS - 1][0] - 1)) is unchanged.

JIRA ID

ROCM-20519

Test Plan

Unit_hipDeviceSynchronize_Functional — verify the test passes and no longer requires oversized copies to mask timing races.

Test Result

No regressions observed in the modified test case.

Submission Checklist


🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.

Co-authored-by: jaydeeppatel1111 <106300970+jaydeeppatel1111@users.noreply.github.com>
Copilot AI changed the title [WIP] [WIP] Address feedback on ROCM-20519 for DeviceSynchronize race stabilization [WIP] Address feedback on ROCM-20519: replace racy pre-sync host read with hipStreamQuery Mar 20, 2026
Copilot AI requested a review from jaydeeppatel1111 March 20, 2026 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants